Abstract

We consider the single-source (or single-sink) buy-at-bulk problem with an unknown concave cost
function. We want to route a set of demands along a graph to or from a designated root node, and the cost
of routing $x$ units of flow along an edge is proportional to some concave, non-decreasing function $f$ such
that $f(0) = 0$. We present a polynomial time algorithm that finds a distribution over trees such that the
expected cost of a tree for any $f$ is within an $O(1)$-factor of the optimum cost for that $f$. The previous
best simultaneous approximation for this problem, even ignoring computation time, was $O(\log |\mathcal{D}|)$,
where $\mathcal{D}$ is the multi-set of demand nodes.

We design a simple algorithmic framework using the ellipsoid method that finds an $O(1)$-approximation
if one exists, and then construct a separation oracle using a novel adaptation of the Guha, Meyerson, and
Munagala [GMM01] algorithm for the single-sink buy-at-bulk problem that proves an $O(1)$ approximation
is possible for all $f$. The number of trees in the support of the distribution constructed by our
algorithm is at most $1 + \log |\mathcal{D}|$.
1 Introduction

We study the single-source (or single-sink) buy-at-bulk network design problem with an unknown concave
cost function. We are given an undirected graph $G = (V, E)$ with edge lengths $\ell_e$ and a set of demand nodes
$\mathcal{D} \subseteq V$ with integer demands $d_v$ and want to route these demands to a designated root node $r$ as cheaply as
possible, where the cost of routing along a particular edge is proportional to some function $f$ of the amount
of flow sent along the edge. In many applications it is natural to assume that $f$ is a concave, non-decreasing
function such that $f(0) = 0$, capturing the case where we benefit from some kind of economy of scale
when aggregating flows together. We call such functions aggregation functions and define $\mathcal{F}$ as the set of
all aggregation functions.

When the function $f$ is given, the problem becomes the well-studied single-sink buy-at-bulk (SSBaB)
problem. SSBaB is NP-hard, since it contains the Steiner tree problem as a special case. The problem
was introduced by Salman et al. [SCRS97], who gave algorithms for special cases. Awerbuch and Azar
[AA97] gave an $O(\log^2 n)$-approximation using metric tree embedding, which was subsequently improved to
$O(\log n)$ using better metric embeddings [Bar98, FRT03]. Building on their own work on hierarchical
facility location [GMM00], Guha, Meyerson, and Munagala (GMM) gave the first constant-factor approximation
[GMM01], an algorithm that features prominently in our results. Recent work [Tal02, GKR03, JR04, GI06]
has reduced the approximation ratio to 24.92 and also provided an elegant cost-sharing framework for
thinking about this problem.

However, for some applications we may want to assume that $f$ is unknown or is known to vary over time.
For instance, we may be aggregating observations in a sensor network where we do not know the amount
of redundancy among different observations or where the redundancy is known to change. In this setting,
it is desirable to find a solution that is robust to changes in $f$ and provides a constant-factor approximation
simultaneously for all $f \in \mathcal{F}$. Moreover, from a purely theoretical perspective, the existence of a good
algorithm that is independent of $f$ reveals non-trivial structure in the problem.

We will focus on randomized algorithms. Given the concavity of $f$, we may assume without loss of
generality that the optimal routing graph is a tree. Let $\mathcal{T}$ be the set of all trees in $G$ spanning $\mathcal{D}$ and $r$, and
let $T_f^*$ be the optimal tree for some fixed $f$. We use the shorthand $f(T)$ to denote the cost of $T$ under $f$, i.e.
$f(T) = \sum_{e \in T} \ell_e f(x_{T,e})$, where $x_{T,e}$ is the amount of flow tree $T$ routes on edge $e$. There are two natural
objectives which capture simultaneous approximation for multiple cost functions. First, we can try to minimize

$$R_1 = \max_{f \in \mathcal{F}} \frac{E[f(T)]}{f(T_f^*)} \tag{1}$$

which essentially gives a distribution over trees such that in expectation, each function $f$ is well-approximated.
Second, and much more difficult, we can look for an algorithm that uses the objective

$$R_2 = E\left[\max_{f \in \mathcal{F}} \frac{f(T)}{f(T_f^*)}\right] \tag{2}$$

A bound on $R_2$ subsumes (1) and proves there exists a single tree that is simultaneously good for all $f$. We
call $R_1$ the oblivious approximation ratio and $R_2$ the simultaneous approximation ratio. In this paper, we
will work with the weaker, oblivious objective (1).

Both objectives have been studied in the literature. The tree embeddings used by Awerbuch and Azar
[AA97] give an $O(\log^2 n)$ oblivious approximation, which was later reduced to $O(\log n)$ [Bar98, FRT03].
Goel and Estrin [GE03] improved this to $O(\log |\mathcal{D}|)$ and also prove the same bound on the stronger
simultaneous objective. Gupta et al. [GHR06] achieve an $O(\log^2 n)$ oblivious approximation for a generalization
where both the function and the demands are unknown. Khuller et al. [KRY95] studied the special case of
simultaneously approximating $f(x) = x$ and $f(x) = 1$ for $x \ge 1$, i.e. the shortest-path and Steiner trees, and
prove an $O(1)$ simultaneous approximation. These 2 functions constitute opposite extremes of functions
in $\mathcal{F}$, and one may wonder if an $O(1)$ approximation for these 2 functions also works for all $f \in \mathcal{F}$ lying
"in-between". However, it is not difficult to construct a graph and a set of demands such that the
shortest-path and Steiner trees are identical, but this tree is an $\omega(1)$-approximation for other $f \in \mathcal{F}$.
Enachescu et al. [EGGM05] achieve an $O(1)$ simultaneous value but only for grid graphs, assuming spatial
correlation among nearby nodes. This naturally leads to the following questions:

Is $R_1 = O(1)$ achievable? If yes, is there a polynomial algorithm that guarantees $R_1 = O(1)$?

We answer both questions in the affirmative. We first write a simple LP formulation of the problem
and show that using the ellipsoid method on the dual we can find an $O(1)$ approximation to the optimal
ratio, whatever it happens to be for a given problem instance. We also show that given an appropriate
separation oracle the optimum is constant and compute an explicit distribution over $1 + \lceil \log(\sum_v d_v) \rceil$ trees
in polynomial time. This general approach is along the lines of small metric tree embeddings [CCG+98]
and oblivious congestion minimization [Rac08].

Our key result is the construction of the necessary separation oracle subroutine, running in polynomial 
time, that proves a constant is achievable. We build our oracle around the GMM algorithm for SSBaB, 
using a modified analysis to solve a different problem in which we bound the cost of the GMM tree by a 
combination of different trees under different cost functions. 



1.1 Organization of the Paper 

In Section 2 we present an LP formulation and a framework using an approximate separation oracle that
finds a constant-factor approximation to the optimal oblivious approximation ratio. In Section 3 we present
our primary result, which proves the oblivious approximation ratio is constant and constructs the separation
oracle required by Section 2 assuming some extra conditions on the input, and in Section 4 we complete
the proof by showing those extra assumptions can be removed. We conclude with some open problems
(including whether $R_2 = O(1)$ can be achieved).



2 LP Formulation and Algorithm Framework 

Let $R_1$ be the worst-case optimal oblivious ratio, i.e.

$$R_1 = \max_{G, \ell, \mathcal{D}, r} \; \min_{\mathcal{M}} \; \max_{f} \frac{E_{T \sim \mathcal{M}}[f(T)]}{f(T_f^*)}$$

where $\mathcal{M}$ is a distribution over $\mathcal{T}$. In this section we discuss the problem of finding an $O(1)$-oblivious
approximation if one exists.

By losing a factor of 2 in the approximation ratio we can restrict our analysis to a smaller class of
aggregation functions. Let $D = 2^{\lceil \log(\sum_v d_v) \rceil}$, the total amount of demand rounded up to the nearest power of
2. We never route more than $D$ flow on any edge, and demands are integral, so we only care about $f(x)$ for integers
$0 \le x \le D$. Suppose $f \in \mathcal{F}$, and $2^i < x < 2^{i+1}$. By the monotonicity of $f$, $f(2^i) \le f(x) \le f(2^{i+1})$, and
by the concavity of $f$, $f(2^{i+1}) \le 2f(2^i)$, so with a loss of a factor of 2 we can interpolate between $f(2^i)$ and
$f(2^{i+1})$ and assume $f$ is piecewise linear with breakpoints only at powers of 2. Let $A_i(x) = \min\{x, 2^i\}$ and
$T_i^*$ the optimal aggregation tree for $A_i$. We call $A_i(x)$ the $i$-th atomic function following the terminology
of Goel and Estrin [GE03], and it is easy to see that any $f \in \mathcal{F}$ that is linear between successive powers of
2 can be written as a linear combination of $\{A_i\}_{0 \le i \le \log D}$. Therefore, it suffices to design an algorithm $\mathcal{A}$
minimizing $\max_i E_{\mathcal{A}}[A_i(T_{\mathcal{A}})] / A_i(T_i^*)$.
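
The decomposition into atomic functions can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes $f(x) = \sqrt{x}$ as a sample aggregation function, and the helper names `decompose` and `combination` are hypothetical. The coefficient of $A_i$ is taken to be the drop in slope of the interpolated function at the breakpoint $2^i$.

```python
import math

def atomic(i, x):
    """i-th atomic function A_i(x) = min(x, 2**i)."""
    return min(x, 2 ** i)

def decompose(values):
    """Given values[i] = f(2**i) for i = 0..m (and f(0) = 0), return
    coefficients alpha such that sum_i alpha[i] * A_i(x) is the piecewise
    linear interpolation of f through 0 and the powers of 2."""
    m = len(values) - 1
    # slopes[0] is the slope on [0, 1]; slopes[i] is the slope on [2^(i-1), 2^i].
    slopes = [values[0]]
    for i in range(1, m + 1):
        slopes.append((values[i] - values[i - 1]) / (2 ** i - 2 ** (i - 1)))
    slopes.append(0.0)  # nothing needs to grow past D = 2^m
    # A_i grows at rate 1 up to 2^i and is flat afterwards, so alpha[i]
    # is exactly the drop in slope at the breakpoint 2^i.
    return [slopes[i] - slopes[i + 1] for i in range(m + 1)]

def combination(alpha, x):
    """Evaluate sum_i alpha[i] * A_i(x)."""
    return sum(a * atomic(i, x) for i, a in enumerate(alpha))

m = 6                      # D = 2^6 = 64
f = math.sqrt              # sample concave, non-decreasing f with f(0) = 0
alpha = decompose([f(2 ** i) for i in range(m + 1)])
```

Concavity and monotonicity make every coefficient non-negative, and the interpolated function stays within a factor of 2 of $f$ on $[1, D]$, which is the loss the paragraph above accounts for.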
Our algorithm makes use of the standard SSBaB problem where $f$ is known. We assume that $f$ is given
in the form of a set of $K$ pipes $\{(\sigma_k, \delta_k)\}_{0 \le k \le K-1}$, where the cost of routing $x$ flow on pipe $k$ is equal to
$\sigma_k + x\delta_k$. Then $f(x)$ is defined as the cost of using the cheapest pipe for $x$ flow: $\min_k \sigma_k + x\delta_k$. We assume
that $\sigma_0 < \sigma_1 < \cdots < \sigma_{K-1}$, and by concavity we can assume $\delta_0 > \delta_1 > \cdots > \delta_{K-1}$. Define $u_k = \sigma_k / \delta_k$, the
point at which the cost due to $\delta_k x$ begins to outweigh the cost due to $\sigma_k$. We call $u_k$ the capacity of pipe $k$;
the name arises from an alternate formulation (equivalent up to a factor of 2) of SSBaB where pipes have a
fixed cost $\sigma_k$ for a fixed capacity $u_k$. Let $\pi_{BaB}$ be the best-known approximation ratio for SSBaB. Currently
$\pi_{BaB} = 24.92$ using an algorithm by Grandoni and Italiano [GI06].

We also employ an approximation algorithm for a special case of SSBaB, the single-sink rent-or-buy
(SSRoB) problem. Here $f(x)$ is characterized by 2 pipes: $(0, 1)$ and $(M, 0)$, i.e. we can pay $x$ to route
$x$ flow or pay $M$ to route any amount of flow. Let $\pi_{RoB}$ be the best-known SSRoB approximation ratio.
Eisenbrand et al. [EGRS08] give a 2.92-approximation.
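
As a small illustration of why SSRoB is the relevant special case, the sketch below checks that the rent-or-buy cost with $M = 2^i$ coincides with the atomic function $A_i$. The function names are hypothetical and the choice $i = 3$ is arbitrary.

```python
def pipe_cost(pipes, x):
    """Cost of x units of flow using the cheapest pipe (sigma + delta * x)."""
    return min(sigma + delta * x for sigma, delta in pipes)

def rent_or_buy_pipes(M):
    """The two SSRoB pipes: rent at unit cost per unit of flow, or buy for M."""
    return [(0, 1), (M, 0)]

def atomic(i, x):
    """i-th atomic function A_i(x) = min(x, 2**i)."""
    return min(x, 2 ** i)

i = 3
pipes = rent_or_buy_pipes(2 ** i)
# Compare the two cost functions on every integer flow value up to 64.
matches = all(pipe_cost(pipes, x) == atomic(i, x) for x in range(0, 65))
```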

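If we can calculate $A_i(T)$ and $A_i(T_i^*)$ for every $i$ and $T \in \mathcal{T}$ then the following linear program finds
the optimal distribution of trees.

$$\begin{aligned}
\min\ & \theta \\
\text{s.t.}\ & \sum_{T \in \mathcal{T}} x_T \ge 1 \\
& \theta A_i(T_i^*) - \sum_{T \in \mathcal{T}} x_T A_i(T) \ge 0 \quad \forall\, 0 \le i \le \log D \\
& x, \theta \ge 0
\end{aligned} \tag{3}$$

In other words, we want a distribution $\{x_T\}_{T \in \mathcal{T}}$ of trees minimizing $\max_i \sum_T x_T A_i(T) / A_i(T_i^*)$. However, this
approach is not directly tractable, as $T_i^*$ is NP-hard to find, and $|\mathcal{T}|$ is exponentially large.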

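We solve an SSRoB approximation for each $A_i$ to get $A_i(\hat{T}_i)$, a $\pi_{RoB}$-approximation, and replace
$A_i(T_i^*)$ with $A_i(\hat{T}_i)$ in the constraints, so that all quantities in the LP are polynomial-time computable. Now
consider the dual of (3), which is given by

$$\begin{aligned}
\max\ & \beta \\
\text{s.t.}\ & \sum_{i=0}^{\log D} \alpha_i A_i(\hat{T}_i) \le 1 \\
& \beta - \sum_{i=0}^{\log D} \alpha_i A_i(T) \le 0 \quad \forall\, T \in \mathcal{T} \\
& \alpha, \beta \ge 0
\end{aligned} \tag{4}$$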

With an approximate separation oracle for the dual we can approximate the solution in polynomial
time using the ellipsoid method, and then transform it into an approximate solution to the primal (3). More
formally:

Theorem 2.1. With a randomized $\pi_{BaB}$-approximation to SSBaB, we can find a $2\pi_{RoB}\pi_{BaB}R_1$-approximation
in expectation to the primal LP (3) that runs in polynomial time with high probability.

The proof uses a SSBaB approximation algorithm to construct an approximate separation oracle for (4).
However, we will not prove this theorem because it is a special case of the following more general result,
assuming that $R_1$ is a constant, which will follow from Theorem 3.8.

Theorem 2.2. If there exists a polynomial-time algorithm $\mathcal{A}$ and a given constant $c$ such that $\forall \alpha_0, \ldots, \alpha_{K-1} \ge 0$,
$\mathcal{A}$ finds $T_{\mathcal{A}}$ such that $E_{\mathcal{A}}[\sum_i \alpha_i A_i(T_{\mathcal{A}})] \le c \sum_i \alpha_i A_i(T_i^*)$ then we can construct an algorithm that
runs in polynomial-time with high probability, makes $O(\mathrm{poly}(\log D))$ calls to $\mathcal{A}$ with high probability, and
achieves an expected oblivious approximation ratio of $2c\pi_{RoB}$ using a distribution over $1 + \log D$ trees.

Proving that such an algorithm $\mathcal{A}$ exists for a constant $c$ is the primary result of this paper and is discussed
in Sections 3 and 4.

Remark 2.3. If $\mathcal{A}$ is deterministic then the algorithm always runs in polynomial time and the expected ratio
is $c\pi_{RoB}$, and if it is randomized then the algorithm runs in polynomial time with high probability and the
expected ratio is $2c\pi_{RoB}$. For randomized $\mathcal{A}$ the ratio can also be reduced to $(1 + \epsilon)c\pi_{RoB}$ with a $1/\epsilon$-factor
increase in the runtime.

Proof of Theorem 2.2. Let $A_i(\hat{T}_i)$ be a $\pi_{RoB}$-approximation to $A_i(T_i^*)$ as above. We construct an
approximate separation oracle $S(\alpha, \beta)$ for the dual (4) as follows:

1. Check if $\sum_i \alpha_i A_i(\hat{T}_i) > 1$. If so, we have a violated constraint and are done.

2. Run $\mathcal{A}(\alpha)$ until it returns a tree $T$ such that $\sum_i \alpha_i A_i(T) \le 2c \sum_i \alpha_i A_i(\hat{T}_i)$.

3. If $\sum_i \alpha_i A_i(T) < \beta$, return $T$. Otherwise, return feasible.
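
The three steps of the oracle can be sketched in a few lines. This is only an illustration under simplifying assumptions: a tree is represented by its vector of costs $(A_0(T), \ldots, A_m(T))$, `run_A` is a hypothetical stand-in for the algorithm $\mathcal{A}$, and the retry cap `max_tries` is an added safeguard that does not appear in the text.

```python
def weighted_cost(alpha, costs):
    """sum_i alpha_i * A_i(T), where costs[i] stands for A_i(T)."""
    return sum(a * c for a, c in zip(alpha, costs))

def separation_oracle(alpha, beta, hat_costs, run_A, c, max_tries=64):
    """Return ("budget", None), ("tree", costs) or ("feasible", None).

    hat_costs[i] plays the role of A_i(hat T_i); run_A(alpha) is a
    hypothetical stand-in for algorithm A and returns the cost vector
    (A_0(T), ..., A_m(T)) of the tree it found.
    """
    budget = weighted_cost(alpha, hat_costs)
    if budget > 1:                          # step 1: first dual constraint violated
        return ("budget", None)
    for _ in range(max_tries):              # step 2: rerun A until the 2c bound holds
        costs = run_A(alpha)
        if weighted_cost(alpha, costs) <= 2 * c * budget:
            break
    else:
        raise RuntimeError("A never met the 2c bound; is c valid?")
    if weighted_cost(alpha, costs) < beta:  # step 3: tree constraint violated
        return ("tree", costs)
    return ("feasible", None)

# Toy instance: A always returns the same tree.
hat_costs = [4, 6, 8]
fixed_tree = [5, 7, 9]
small_alpha = [1 / 32] * 3
large_alpha = [1 / 8] * 3
violated = separation_oracle(small_alpha, 5, hat_costs, lambda a: fixed_tree, c=2)
feasible = separation_oracle(small_alpha, 0.5, hat_costs, lambda a: fixed_tree, c=2)
over_budget = separation_oracle(large_alpha, 5, hat_costs, lambda a: fixed_tree, c=2)
```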

For a fixed $\beta$, let $P_\beta$ be the polytope defined by $\sum_i \alpha_i A_i(\hat{T}_i) \le 1$, and $\beta - \sum_i \alpha_i A_i(T) \le 0$ for all
$T \in \mathcal{T}$. We run the following procedure to find the desired distribution of trees:

1. Run the ellipsoid method to check the feasibility of $P_{2c}$, starting with the initial bounding box
$0 \le \alpha_i \le 1\ \forall i$ and using $S$ as the separation oracle. It will terminate as infeasible.

2. Let $C$ be the set of constraints returned by $S$ proving $P_{2c}$ is infeasible. It consists of
$\sum_{i=0}^{\log D} \alpha_i A_i(\hat{T}_i) \le 1$, and $2c - \sum_{i=0}^{\log D} \alpha_i A_i(T) \le 0$ for $T$ in some subset of trees $\mathcal{T}'$.

3. In the dual LP (4), restrict the constraints to $C$, and take the dual to get

$$\begin{aligned}
\min\ & \theta \\
\text{s.t.}\ & \sum_{T \in \mathcal{T}'} x_T \ge 1 \\
& \theta A_i(\hat{T}_i) - \sum_{T \in \mathcal{T}'} x_T A_i(T) \ge 0 \quad \forall\, 0 \le i \le \log D \\
& x, \theta \ge 0
\end{aligned} \tag{5}$$

4. Find a vertex optimal solution to (5) and return the distribution $\{x_T^*\}$.

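First, we claim that $S(\alpha, \beta)$ will find a violated constraint whenever $\beta > 2c$ and will do so in polynomial
time with high probability. If $\sum_i \alpha_i A_i(\hat{T}_i) \le 1$ is violated, then we are done. If not, we know $\mathcal{A}(\alpha)$ finds
$T_{\mathcal{A}}$ such that

$$E_{\mathcal{A}}\Big[\sum_i \alpha_i A_i(T_{\mathcal{A}})\Big] \le c \sum_i \alpha_i A_i(T_i^*) \le c \sum_i \alpha_i A_i(\hat{T}_i)$$

By Markov's inequality $\Pr[\sum_i \alpha_i A_i(T_{\mathcal{A}}) > 2c \sum_i \alpha_i A_i(\hat{T}_i)] < \frac{1}{2}$, so with high probability $O(\log n)$
invocations of $\mathcal{A}$, each running in polynomial time, suffice in step 2 of $S$ to find a $T$ satisfying
$\sum_i \alpha_i A_i(T) \le 2c \sum_i \alpha_i A_i(\hat{T}_i)$. Now if $\beta > 2c$, then since $\sum_i \alpha_i A_i(\hat{T}_i) \le 1$ we have
$\sum_i \alpha_i A_i(T) \le 2c < \beta$, so the constraint $\beta - \sum_i \alpha_i A_i(T) \le 0$ is violated.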

With the necessary separation oracle, the ellipsoid algorithm can solve feasibility of $P_\beta$ in $O(\mathrm{poly}(\log D))$
iterations, so using $S$ it will conclude $P_{2c}$ is infeasible.$^1$ The set of constraints $C$ returned by $S$
during the execution constitutes a proof of infeasibility, and $C$ consists of $\sum_{i=0}^{\log D} \alpha_i A_i(\hat{T}_i) \le 1$, and
$\beta - \sum_{i=0}^{\log D} \alpha_i A_i(T) \le 0$ for each $T$ in some set of trees $\mathcal{T}'$.

Consider writing (4) with only the constraints in $C$. Taking the dual yields (5), which only has variables
$x_T$ for $T \in \mathcal{T}'$. The ellipsoid algorithm concluded $P_{2c}$ is infeasible after $O(\mathrm{poly}(\log D))$ iterations, so $|\mathcal{T}'|$
is only polynomially large in the input size, implying we can solve (5) exactly in polynomial time.



$^1$ In practice $\mathcal{A}$ may find violated constraints for $\beta < 2c$, and we can do binary search to find the smallest infeasible $\beta$. However,
we cannot improve the provable guarantee beyond $\beta = c$, and this comes at a cost to the runtime.
Find a vertex-optimal solution $\theta^*, x^*$ to (5). The constraints in $C$ are enough to restrict the optimal dual
objective to be at most $2c$, so by duality $\theta^* \le 2c$. Therefore, for all $i$

$$\sum_{T \in \mathcal{T}'} x_T^* A_i(T) \le \theta^* A_i(\hat{T}_i) \le 2c A_i(\hat{T}_i) \le 2c\pi_{RoB} A_i(T_i^*)$$

Divide by $A_i(T_i^*)$ to get the oblivious ratio:

$$\max_i \frac{\sum_{T \in \mathcal{T}'} x_T^* A_i(T)}{A_i(T_i^*)} \le 2c\pi_{RoB}$$

Moreover, we claim $\{x_T^*\}$ is a distribution over only $1 + \log D$ trees. The LP (5) has $|\mathcal{T}'| + 1$ variables
and $2 + \log D$ constraints, and the vertex-optimal solution $\theta^*, x^*$ must have $|\mathcal{T}'| + 1$ tight constraints,
implying at least $|\mathcal{T}'| - \log D - 1$ non-negativity constraints must be tight. We know $\theta^*$ is positive, so
at most $1 + \log D$ of the variables $x_T$ can be non-zero. □



3 The Separation Oracle Subroutine A 

By Theorem 2.1 we can find an $O(1)$-approximation to $R_1$, whatever it may be, but it remains to prove that
this optimal ratio is a constant. In this section we construct the procedure $\mathcal{A}$ required by Theorem 2.2 using
the GMM algorithm for SSBaB.

Our contribution is adapting a special case of the analysis of the GMM algorithm, namely those cases
that arise when $f(x) = \sum_i \alpha_i A_i(x)$, to solve a different problem: that of bounding the cost of the output
by $\sum_i \alpha_i A_i(T_i^*)$ rather than $f(T_f^*)$. The GMM algorithm and proof work in stages and bound the cost of
the pipes laid in each stage by a different chunk of the optimal tree $T_f^*$. On the other hand, in our proof we
bound the cost of each stage by the cost of a different tree evaluated under a different cost function.

3.1 Background: The GMM Algorithm 

For completeness, we summarize the GMM algorithm and the key lemmas and definitions. See the original
paper [GMM01] for a thorough treatment. We are given a graph, demands $\mathcal{D}$, and pipes $\{(\sigma_k, \delta_k)\}_{k \in [K]}$ as
described in Section 2. We assume the costs of successive pipes differ "significantly": for some constant $\gamma$
such that $0 < \gamma < \frac{1}{2}$, we have that $\delta_{k+1} \le \gamma\delta_k$ and $\sigma_k \le \gamma\sigma_{k+1}$. For the SSBaB problem, it is easy to
satisfy these constraints for arbitrary pipes with only an $O(1)$-factor loss. For our problem, it is harder but
still possible, and this is discussed in Section 4.

We define $g_k$ as the indifference point between pipe $k$ and $k + 1$, which is the solution to the equation
$\sigma_k + \delta_k g_k = \sigma_{k+1} + \delta_{k+1} g_k$, and we define $b_k$ as the solution to $\sigma_{k+1} + \delta_{k+1} b_k = 2\gamma(\sigma_k + \delta_k b_k)$, which
we interpret as the point at which pipe $k + 1$ becomes "significantly" cheaper than pipe $k$. It is easy to see
that $u_k \le b_k \le u_{k+1}$ for all $k$.

The algorithm uses $O(1)$-approximations for Steiner tree and load-balanced facility location (LBFL), a
generalization of the standard facility location problem. In the LBFL problem we have a graph and demands
as in SSBaB, a facility cost $F_v$ for each node $v$, and a lower bound on the demand that a facility at $v$ must
service. The objective is to choose facilities and routing paths so as to minimize the sum of the cost of the
open facilities and the distances traveled by the demands to a servicing facility. To approximate the LBFL
we must relax the lower bound. Using [GMM00] we can approximate the optimal LBFL cost to within $2\pi_F$
while reducing the lower bound by a factor of at most 3. Here $\pi_F$ denotes the best approximation to the
normal facility location problem, currently $\pi_F = 1.52$ by Mahdian et al. [MYZ02]. We use $\pi_S$ to denote
the best approximation ratio for Steiner tree, currently 1.55 due to Robins and Zelikovsky [RZ00].
Now we can describe the GMM algorithm itself. At stage $k$, we lay pipe type $k$, and we break each stage
into a Steiner tree step and a "shortest-path" tree step based on whether the cost of pipe $k$ is dominated by
the term $\sigma_k$ or the term $\delta_k x$. The effective demands will also change each stage. Let $\mathcal{D}^{(k)}$ be the demand
nodes at the start of stage $k$, and $d_v^{(k)}$ the stage $k$ demand at $v \in \mathcal{D}^{(k)}$. Initially $\mathcal{D}^{(0)} = \mathcal{D}$.

1. Steiner Tree: Find a $\pi_S$-approximate Steiner tree on $\mathcal{D}^{(k)} \cup \{r\}$ with edge cost per unit length $\sigma_k$.
Route all demands toward $r$. Cut the farthest-upstream edge with more than $u_k$ flow, recalculate the
flow, and repeat to get a forest with at least $u_k$ flow at each root other than $r$ and at most $u_k$ flow on
each edge.

2. Consolidation: Let $t$ be a subtree not containing $r$ and $S_t$ the demand nodes in $\mathcal{D}^{(k)}$ it contains. Choose
$v \in S_t$ with probability $d_v^{(k)} / \sum_{u \in S_t} d_u^{(k)}$ and route all demand in $t$ back to $v$ using pipe $k$.

3. Shortest Path Tree: Approximately solve a LBFL problem with facility lower bound $b_k$ and edge cost
per unit length $\delta_k$ on the original demands $\mathcal{D}$ (not $\mathcal{D}^{(k)}$ and $d_v^{(k)}$). This creates a forest of shortest-path
trees with at least $b_k/3$ flow at each root. If $b_k$ demand does not exist, route everything to $r$.

4. Consolidation: Let $t$ be a subtree in the above forest servicing the demands $S_t$ in $\mathcal{D}$. Choose $v \in S_t$ with
probability $d_v / \sum_{u \in S_t} d_u$, and route the true, current demand $d_u^{(k)}$ in $S_t$ back to $v$. Let $\mathcal{D}^{(k+1)}$ be the set of
nodes chosen for consolidation and $d_v^{(k+1)}$ the demand at these nodes after consolidation.

Next, we mention the crucial lemmas in the GMM analysis used in our proof. See [GMM01] for the
proofs.

Lemma 3.1 (GMM Lemma 4.1). Let $d_v^{(k)}$ be the current demand at some $v \in \mathcal{D}$ immediately after any
consolidation step. Then $E[d_v^{(k)}] = d_v$, i.e. the original demand.
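
The reason a consolidation step preserves expected demand can be seen on a single group of nodes. The sketch below is illustrative only: the group `{"a": 3, "b": 1, "c": 4}` and the function names are made up, and it models one consolidation in isolation rather than the full GMM algorithm. A node receives the whole group's demand with probability proportional to its own demand, so its expected demand is unchanged.

```python
from fractions import Fraction
import random

def consolidate(demands, rng):
    """Pick one node with probability proportional to its demand and move
    the whole group's demand there. `demands` maps node -> demand."""
    nodes = list(demands)
    total = sum(demands.values())
    chosen = rng.choices(nodes, weights=[demands[v] for v in nodes])[0]
    return {v: (total if v == chosen else 0) for v in nodes}

def expected_demand(demands):
    """Exact expectation of each node's demand after one consolidation:
    Pr[v chosen] * total = (d_v / total) * total."""
    total = sum(demands.values())
    return {v: Fraction(d, total) * total for v, d in demands.items()}

group = {"a": 3, "b": 1, "c": 4}
after = consolidate(group, random.Random(0))   # one random outcome
expectation = expected_demand(group)           # equals the original demands
```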

Using an algorithm that is a 3-approximation to the LBFL facility lower bounds, we have the following:

Lemma 3.2 (GMM Lemma 4.5). For every $v \in \mathcal{D}^{(k)}$, we have $E[d_v^{(k)}] \ge b_{k-1}/3$.

Define $P_k^I$ to be the incremental cost (due to $\delta$) of the pipes laid in the facility location step in stage $k$
and $P_k^F$ to be the fixed cost (due to $\sigma$) of the pipes laid in the Steiner tree step in stage $k$. All of the other
costs incurred by the GMM algorithm can be bounded by $P_k^I$ and $P_k^F$, so our analysis need only consider
these quantities:

Lemma 3.3 (GMM Lemmas 4.2, 4.4, and 4.8). Let $P_k^I$ and $P_k^F$ be as defined above. Then
$E[f(T_{GMM})] \le 4 \sum_k E[P_k^I + P_k^F]$, where $T_{GMM}$ is the final tree.

3.2 Adapting the GMM Algorithm 

From Theorem 2.2 we are given $\alpha$ such that $\alpha_i \ge 0$, and $\sum_i \alpha_i A_i(\hat{T}_i) \le 1$. We want to find a tree $T$ using
the GMM algorithm such that $\sum_i \alpha_i A_i(T) \le c \sum_i \alpha_i A_i(T_i^*)$. Define $L = \sum_i \alpha_i A_i(T_i^*)$, the multi-level
cost, and $f(x) = \sum_i \alpha_i A_i(x)$, the concave cost function. Using this notation our objective becomes to find
$T$ such that $f(T) \le cL$. Define $K$ as the number of non-zero $\alpha_i$, and for $0 \le k \le K - 1$ define $p(k) = j$
where $j$ is the index of the $k$-th non-zero $\alpha_i$.

First, we claim that given $\alpha$ we can define the pipes $\{(\sigma_k, \delta_k)\}$ used by the GMM algorithm, and given
SSBaB pipes satisfying some minor conditions we can recover $\alpha$. The following lemmas characterize the
equivalence between the 2 types of parameters:
Lemma 3.4. Given $\alpha$ satisfying $\alpha_i \ge 0$ with $K$ non-zero $\alpha_i$, the SSBaB pipes $\{(\sigma_k, \delta_k)\}_{0 \le k \le K}$ defined
by $\delta_k = \sum_{j \ge k} \alpha_{p(j)}$ and $\sigma_k = \sum_{j < k} \alpha_{p(j)} 2^{p(j)}$ define the function $f(x)$. That is,
$f(x) = \sum_i \alpha_i A_i(x) = \min_k\{\sigma_k + \delta_k x\}$.

Lemma 3.5. Suppose we are given $K + 1$ SSBaB pipes $\{(\sigma_k, \delta_k)\}_{0 \le k \le K}$ such that $\sigma_0 = 0$ and $g_k$ is a
power of 2 for all $k$. For $0 \le k \le K - 1$, let $p(k) = \log g_k$, $\alpha_{p(k)} = \delta_k - \delta_{k+1}$, and $\alpha_j = 0$ whenever
$j \ne p(k)$ for all $k$. Then $\sum_i \alpha_i A_i(x) = \min_k\{\sigma_k + \delta_k x\}$.

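Proof of Lemma 3.4. By definition $f(x) = \sum_k \alpha_{p(k)} A_{p(k)}(x)$. For any $k$, $f(x)$ is linear from $2^{p(k-1)}$ to
$2^{p(k)}$ (we will assume $2^{p(-1)} = 0$ for consistency of notation), which will correspond to pipe $k$. For
$x \in [2^{p(k-1)}, 2^{p(k)}]$, the functions $A_{p(0)}(x), \ldots, A_{p(k-1)}(x)$ have leveled off, and $A_{p(k)}(x), \ldots, A_{p(K-1)}(x)$ are
growing at rate 1. Define $\delta_k$ as the slope of $f(x)$ in this interval: $\delta_k = \sum_{j \ge k} \alpha_{p(j)}$.
Now we can define $\sigma_k$ to match $f(x)$ in the interval $[2^{p(k-1)}, 2^{p(k)}]$:

$$\begin{aligned}
\sigma_k + \delta_k 2^{p(k-1)} = \sum_i \alpha_i A_i(2^{p(k-1)}) &= \sum_{j < k} \alpha_{p(j)} 2^{p(j)} + \sum_{j \ge k} \alpha_{p(j)} 2^{p(k-1)} \\
&= \sum_{j < k} \alpha_{p(j)} 2^{p(j)} + \delta_k 2^{p(k-1)} \\
\Longrightarrow \quad \sigma_k &= \sum_{j < k} \alpha_{p(j)} 2^{p(j)}
\end{aligned}$$

We also add a $(K + 1)$-st pipe such that $\delta_K = 0$ and $\sigma_K = \sum_k \alpha_{p(k)} 2^{p(k)}$ to cover the interval after every
$A_{p(k)}$ has leveled off. Now, we claim $f(x) = \min_j\{\sigma_j + \delta_j x\}$: for each $k$ we know $f(x) = \sigma_k + \delta_k x$
whenever $x \in [2^{p(k-1)}, 2^{p(k)}]$ by our choice of $\delta_k$ and $\sigma_k$, and by the concavity of $f(x)$ for each $j$ we have
$\sigma_j + \delta_j x \ge f(x)$ when $x \le 2^{p(j-1)}$ or $x \ge 2^{p(j)}$. Therefore no other pipe can be cheaper in this interval.
Concavity also ensures that $\sigma_k < \sigma_{k+1}$ and $\delta_k > \delta_{k+1}$ for all $k$, yielding valid SSBaB pipes. □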
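
The construction in Lemma 3.4 is easy to check on a small instance. The sketch below is an illustration with a made-up vector `alpha` and hypothetical function names; it builds the $K + 1$ pipes from the formulas in the lemma and compares the two cost functions on integer flows.

```python
def atomic(i, x):
    """i-th atomic function A_i(x) = min(x, 2**i)."""
    return min(x, 2 ** i)

def pipes_from_alpha(alpha):
    """Build the K+1 pipes (sigma_k, delta_k) of Lemma 3.4 from alpha."""
    p = [i for i, a in enumerate(alpha) if a > 0]   # p[k] = index of k-th non-zero alpha
    K = len(p)
    pipes = []
    for k in range(K + 1):
        sigma = sum(alpha[p[j]] * 2 ** p[j] for j in range(k))   # terms with j < k
        delta = sum(alpha[p[j]] for j in range(k, K))            # terms with j >= k
        pipes.append((sigma, delta))
    return pipes

def f_atomic(alpha, x):
    """f(x) as a combination of atomic functions."""
    return sum(a * atomic(i, x) for i, a in enumerate(alpha))

def f_pipes(pipes, x):
    """f(x) as the cost of the cheapest pipe."""
    return min(sigma + delta * x for sigma, delta in pipes)

alpha = [3, 0, 2, 0, 0, 1]      # non-zero at indices 0, 2 and 5
pipes = pipes_from_alpha(alpha)
agree = all(f_atomic(alpha, x) == f_pipes(pipes, x) for x in range(0, 65))
```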

Proof of Lemma 3.5. Let $K + 1$ be the number of pipes, and $\delta_0 > \cdots > \delta_K$, $0 = \sigma_0 < \cdots < \sigma_K$. Since we
never route more than $D$ flow we may assume the cost function levels off at some $x \le D$, so that $\delta_K = 0$.
Define $p(k) = \log g_k$ for $0 \le k \le K - 1$: when we change pipes at $g_k$ the slope of $f(x)$ drops, which can
occur only because the term $\alpha_{p(k)} A_{p(k)}(x)$ levels off. Recover $\alpha_{p(k)}$ by reversing the definitions in the proof
of Lemma 3.4: we have $\delta_k = \sum_{j \ge k} \alpha_{p(j)}$, so for $k \le K - 1$ let $\alpha_{p(k)} = \delta_k - \delta_{k+1}$.

We now show by induction that $\sum_k \alpha_{p(k)} A_{p(k)}(x) = \min_j\{\sigma_j + \delta_j x\}$. For the base case $x \in [0, g_0]$,
we have

$$\min_j\{\sigma_j + \delta_j x\} = \delta_0 x = (\delta_0 - \delta_K)x = \sum_{k=0}^{K-1} (\delta_k - \delta_{k+1})x = \sum_k \alpha_{p(k)} x = \sum_k \alpha_{p(k)} A_{p(k)}(x)$$

Now assume that for $x \in [0, g_{i-1}]$ we have $\sum_k \alpha_{p(k)} A_{p(k)}(x) = \min_j\{\sigma_j + \delta_j x\}$. For $x \in (g_{i-1}, g_i]$, we know
that $f(x) = \sigma_i + \delta_i x$. Therefore,

$$\begin{aligned}
\sigma_i + \delta_i x &= (\sigma_{i-1} + \delta_{i-1} 2^{p(i-1)}) + \delta_i (x - 2^{p(i-1)}) \\
&= \sum_k \alpha_{p(k)} A_{p(k)}(2^{p(i-1)}) + \sum_{k=i}^{K-1} (\delta_k - \delta_{k+1})(x - 2^{p(i-1)}) \\
&= \sum_k \alpha_{p(k)} A_{p(k)}(x)
\end{aligned}$$

We use that pipes $i - 1$ and $i$ have equal cost at $g_{i-1}$ in the first line and the induction hypothesis in the
second line. □
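
The reverse direction in Lemma 3.5 can be sketched the same way. The example below is a minimal illustration with a made-up pipe set (chosen so that $\sigma_0 = 0$, $\delta_K = 0$, and every indifference point is a power of 2); the function names are hypothetical.

```python
from fractions import Fraction

def alpha_from_pipes(pipes):
    """Recover alpha (as a dict index -> coefficient) from pipes
    [(sigma_0, delta_0), ..., (sigma_K, delta_K)] with sigma_0 = 0,
    delta_K = 0 and power-of-2 indifference points, as in Lemma 3.5."""
    alpha = {}
    for k in range(len(pipes) - 1):
        (s0, d0), (s1, d1) = pipes[k], pipes[k + 1]
        g = Fraction(s1 - s0, d0 - d1)          # indifference point g_k
        n = g.numerator
        if g.denominator != 1 or n < 1 or n & (n - 1):
            raise ValueError("indifference point is not a power of 2")
        p = n.bit_length() - 1                  # p(k) = log2(g_k)
        alpha[p] = d0 - d1                      # alpha_{p(k)} = delta_k - delta_{k+1}
    return alpha

def f_alpha(alpha, x):
    """sum_i alpha_i * min(x, 2**i)."""
    return sum(a * min(x, 2 ** i) for i, a in alpha.items())

def f_pipes(pipes, x):
    """Cost of the cheapest pipe for x units of flow."""
    return min(sigma + delta * x for sigma, delta in pipes)

pipes = [(0, 6), (3, 3), (11, 1), (43, 0)]
alpha = alpha_from_pipes(pipes)
agree = all(f_alpha(alpha, x) == f_pipes(pipes, x) for x in range(0, 65))
```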



We note that $\alpha_{p(k)}$ corresponds not to a particular SSBaB pipe, but to a breakpoint between pipes: when
we switch from pipe $k$ to $k + 1$ at $2^{p(k)}$ flow, the slope of $f$ drops from $\delta_k$ to $\delta_{k+1}$, which is caused by the
term $\alpha_{p(k)} A_{p(k)}(x)$ leveling off.

Given the above equivalence, we will use $\alpha$ and $\{(\sigma_k, \delta_k)\}_k$ interchangeably for the remainder of the
paper, using whichever representation is more convenient and converting from one form to another using
Lemmas 3.4 and 3.5. However, the additional constraints that for some parameter $0 < \gamma < \frac{1}{2}$ we have
$\delta_{k+1} \le \gamma\delta_k$ and $\sigma_k \le \gamma\sigma_{k+1}$ for all pipes $k$ will restrict the possible vectors $\alpha$ that can be run through the
algorithm:

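Definition 3.6. Call $\alpha$ $\gamma$-regular if the pipes found using Lemma 3.4 satisfy $\delta_{k+1} \le \gamma\delta_k$ and $\sigma_k \le \gamma\sigma_{k+1}$.

We note the following constraints that $\gamma$-regularity imposes on $\alpha$:

Lemma 3.7. If $\delta_{k+1} \le \gamma\delta_k$, then $\alpha_{p(k)} \ge (1 - \gamma)\delta_k$ and $\alpha_{p(k)} \ge \frac{1-\gamma}{\gamma}\alpha_{p(k+1)}$.

Proof. These follow immediately from $\alpha_{p(k)} = \delta_k - \delta_{k+1}$ and $\delta_{k+1} \le \gamma\delta_k$. □

3.3 Approximation guarantee assuming regular $\alpha$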

We will first prove the existence of the separation oracle procedure $\mathcal{A}$ in Theorem 2.2 for $\gamma$-regular $\alpha$ and
later prove in Section 4 that arbitrary $\alpha$ can be regularized with only an $O(1)$ change in $f(x)$ and $L$:

Theorem 3.8. Let $\alpha$ be $\gamma$-regular, and let $f(x) = \sum_i \alpha_i A_i(x)$, and $L = \sum_i \alpha_i A_i(T_i^*)$. Then the GMM
algorithm finds a tree $T_{GMM}$ such that $E[f(T_{GMM})] = O(L)$.

Roughly, our proof bounds the cost of the pipes laid in phase $k$ of the algorithm by $\alpha_{p(k)} A_{p(k)}(T_{p(k)}^*)$.
Using Lemma 3.3 we concentrate on $P_k^F$ and $P_k^I$ and ignore the other costs. First, we bound the cost of the
Steiner tree steps:

Lemma 3.9. Let $\pi_S$ be the approximation ratio for Steiner tree. Then we have $\sum_k E[P_k^F] \le \frac{3\pi_S}{1-\gamma} L$.

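Proof. We need to bound the cost of a Steiner tree spanning the current demands $\mathcal{D}^{(k)}$ with cost per unit
length $\sigma_k$. If $k = 0$, then $\sigma_k = 0$ and we have nothing to bound, so assume $k > 0$.

We use the edges in $T_{p(k-1)}^*$. Note that it spans $\mathcal{D} \cup \{r\}$ and hence $\mathcal{D}^{(k)} \cup \{r\}$, and let $W_k \subseteq T_{p(k-1)}^*$ be
the subset of edges spanning these nodes. By Lemma 3.2 each $v \in \mathcal{D}^{(k)}$ has aggregated at least $E[d_v^{(k)}] \ge b_{k-1}/3$
demand. At the end of the previous LBFL phase, we chose a node $v$ for consolidation from the set of
all $u$ routing to the same facility with probability $d_v / \sum_u d_u \le 3d_v / b_{k-1}$. An edge $e$ is in $W_k$ only if some
$v \in \mathcal{D}^{(k)}$ routes through it, so by the union bound an edge carrying $x_e^*$ demand in $T_{p(k-1)}^*$ is in $W_k$ with
probability at most $3x_e^* / b_{k-1}$.

The tree $W_k$ pays $\sigma_k$ for any amount of flow, whereas $T_{p(k-1)}^*$ pays $A_{p(k-1)}(x_e^*) = \min\{2^{p(k-1)}, x_e^*\}$ to
send $x_e^*$ flow on $e$. Then the cost of $W_k$ is

$$\begin{aligned}
E[W_k] = \sigma_k \sum_e \Pr[e \in W_k]\,\ell_e &= \sigma_k \sum_e \Pr[e \in W_k]\,\ell_e\,\frac{A_{p(k-1)}(x_e^*)}{\min\{x_e^*, 2^{p(k-1)}\}} \\
&\le \frac{3\sigma_k}{b_{k-1}} \sum_{e:\, x_e^* < 2^{p(k-1)}} A_{p(k-1)}(x_e^*)\,\ell_e + \frac{\sigma_k}{2^{p(k-1)}} \sum_{e:\, x_e^* \ge 2^{p(k-1)}} A_{p(k-1)}(x_e^*)\,\ell_e
\end{aligned}$$

We need to bound $\frac{\sigma_k}{b_{k-1}}$ and $\frac{\sigma_k}{2^{p(k-1)}}$. For the former term,

$$\frac{\sigma_k}{b_{k-1}} = \frac{\sigma_k(2\gamma\delta_{k-1} - \delta_k)}{\sigma_k - 2\gamma\sigma_{k-1}} \le \frac{\sigma_k(2\gamma\delta_{k-1} - \delta_k)}{2\gamma\sigma_k(1 - \gamma)} \le \frac{2\gamma(\delta_{k-1} - \delta_k)}{2\gamma(1 - \gamma)} = \frac{\alpha_{p(k-1)}}{1 - \gamma}$$

using that $b_{k-1} = \frac{\sigma_k - 2\gamma\sigma_{k-1}}{2\gamma\delta_{k-1} - \delta_k}$ by definition, the $\gamma$-regularity constraints on $\sigma_{k-1}$, and the fact that $2\gamma < 1$.
For the latter term,

$$\frac{\sigma_k}{2^{p(k-1)}} = \frac{\sigma_{k-1} + \alpha_{p(k-1)} 2^{p(k-1)}}{2^{p(k-1)}} \le \frac{\gamma\sigma_k + \alpha_{p(k-1)} 2^{p(k-1)}}{2^{p(k-1)}} \quad \Longrightarrow \quad \frac{\sigma_k}{2^{p(k-1)}} \le \frac{\alpha_{p(k-1)}}{1 - \gamma}$$

using the formula for $\sigma_k$ in Lemma 3.4 and $\gamma$-regularity.
Plug these into the final line of the bound on $E[W_k]$ above:

$$E[W_k] \le \frac{\alpha_{p(k-1)}}{1 - \gamma} \left( 3 \sum_{e:\, x_e^* < 2^{p(k-1)}} A_{p(k-1)}(x_e^*)\,\ell_e + \sum_{e:\, x_e^* \ge 2^{p(k-1)}} A_{p(k-1)}(x_e^*)\,\ell_e \right) \le \frac{3}{1 - \gamma}\,\alpha_{p(k-1)} A_{p(k-1)}(T_{p(k-1)}^*)$$

We lose another factor of $\pi_S$ in approximating the Steiner tree. Sum over all $k$ to bound $\sum_k E[P_k^F]$ by
$\frac{3\pi_S}{1-\gamma} L$. □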

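Analyzing the LBFL step requires an additional lemma bounding the difference between $g_k$ and $b_k$:

Lemma 3.10. For every $k$, $g_k \le b_k \le \frac{1 - 2\gamma^2}{\gamma} g_k$.

Proof. The bound $g_k \le b_k$ follows from Lemma 3.5 in GMM [GMM01]. For the other inequality, from the
definition of $b_k$ and $g_k$ we have

$$g_k = \frac{\sigma_{k+1} - \sigma_k}{\delta_k - \delta_{k+1}}, \qquad b_k = \frac{\sigma_{k+1} - 2\gamma\sigma_k}{2\gamma\delta_k - \delta_{k+1}} \quad \Longrightarrow \quad \frac{b_k}{g_k} = \frac{\sigma_{k+1} - 2\gamma\sigma_k}{\sigma_{k+1} - \sigma_k} \cdot \frac{\delta_k - \delta_{k+1}}{2\gamma\delta_k - \delta_{k+1}}$$

For the ratio of $\sigma$ terms,

$$\frac{\sigma_{k+1} - 2\gamma\sigma_k}{\sigma_{k+1} - \sigma_k} = \frac{\sigma_{k+1} - \sigma_k}{\sigma_{k+1} - \sigma_k} + (1 - 2\gamma)\frac{\sigma_k}{\sigma_{k+1} - \sigma_k} \le 1 + (1 - 2\gamma)\frac{\gamma\sigma_{k+1}}{(1 - \gamma)\sigma_{k+1}} = 1 + \frac{\gamma - 2\gamma^2}{1 - \gamma} = \frac{1 - 2\gamma^2}{1 - \gamma}$$

Similarly, for the $\delta$s,

$$\frac{\delta_k - \delta_{k+1}}{2\gamma\delta_k - \delta_{k+1}} = \frac{2\gamma\delta_k - \delta_{k+1}}{2\gamma\delta_k - \delta_{k+1}} + (1 - 2\gamma)\frac{\delta_k}{2\gamma\delta_k - \delta_{k+1}} \le 1 + (1 - 2\gamma)\frac{\delta_k}{(2\gamma - \gamma)\delta_k} = \frac{1 - \gamma}{\gamma}$$

Combining the 2 bounds,

$$\frac{b_k}{g_k} \le \frac{1 - 2\gamma^2}{1 - \gamma} \cdot \frac{1 - \gamma}{\gamma} = \frac{1 - 2\gamma^2}{\gamma}$$

□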
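
Lemma 3.10 can be sanity-checked with exact arithmetic. The sketch below is an illustration only: the three pipes and $\gamma = 1/4$ are made-up values chosen to satisfy the regularity constraints $\delta_{k+1} \le \gamma\delta_k$ and $\sigma_k \le \gamma\sigma_{k+1}$, and the function names are hypothetical.

```python
from fractions import Fraction as Fr

def indifference_point(pipes, k):
    """g_k: solves sigma_k + delta_k * g = sigma_{k+1} + delta_{k+1} * g."""
    (s0, d0), (s1, d1) = pipes[k], pipes[k + 1]
    return (s1 - s0) / (d0 - d1)

def significant_point(pipes, k, gamma):
    """b_k: solves sigma_{k+1} + delta_{k+1} * b = 2*gamma*(sigma_k + delta_k * b)."""
    (s0, d0), (s1, d1) = pipes[k], pipes[k + 1]
    return (s1 - 2 * gamma * s0) / (2 * gamma * d0 - d1)

gamma = Fr(1, 4)
# (sigma_k, delta_k) pairs; each sigma grows and each delta shrinks by a factor >= 1/gamma.
pipes = [(Fr(1), Fr(16)), (Fr(8), Fr(2)), (Fr(64), Fr(1, 4))]
factor = (1 - 2 * gamma ** 2) / gamma          # (1 - 2*gamma^2) / gamma

g = [indifference_point(pipes, k) for k in range(2)]
b = [significant_point(pipes, k, gamma) for k in range(2)]
lemma_holds = all(g[k] <= b[k] <= factor * g[k] for k in range(2))
```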
Now we can bound the LBFL cost $E[P_k^I]$:

Lemma 3.11. We have that $\sum_k E[P_k^I] \le 2\pi_F \frac{1 - 2\gamma^2}{\gamma - \gamma^2} L$, where $\pi_F$ is the approximation ratio for the standard
(non-load-balanced) facility location problem.

Proof. In the shortest path tree step, the GMM algorithm solves an LBFL problem on the original demands
$\mathcal{D}$ with facility lower bound $b_k$ and edge cost per unit length $\delta_k$. We will construct a feasible solution using
the edges of $T_{p(k)}^*$. Orient the edges towards $r$, and find the farthest upstream (i.e. away from $r$) edge routing
at least $b_k$ flow. Cut the edge, and place a facility at the upstream node. Subtract this flow from downstream
edges, and repeat the procedure. If we finish with less than $b_k$ flow at the root node, we route each demand
still reaching the root from its source vertex along the tree to the nearest existing facility (according to
distances in $T_{p(k)}^*$). Let $F_k$ be the resulting forest, and note that it has at least $b_k$ flow at each facility.

For an edge $e$ let $x_e$ be the amount $F_k$ routes on $e$ when the demands $\mathcal{D}$ are routed, and $x_e^*$ the amount
that $T_{p(k)}^*$ routes on $e$. We now show that $x_e \le x_e^*$. If we finish cutting $T_{p(k)}^*$ with at least $b_k$ at the root then
all flows are a subset of the flows in $T_{p(k)}^*$, so $x_e \le x_e^*$. If we end up with too little demand for a facility in
the final step then some of those demands will not be flowing downstream towards $r$ in $F_k$. For each edge
they take towards $r$, they are following the routing in $T_{p(k)}^*$, so $x_e \le x_e^*$. For each edge $e$ taken away from $r$,
we are no longer following $T_{p(k)}^*$, but we must be moving upstream towards the nearest facility. This implies
that in the tree $T_{p(k)}^*$ edge $e$ carried more than $b_k$ flow because all demand at the upstream facility flowed
through $e$ towards $r$. Since we are sending strictly less than $b_k$ demand upstream we still have $x_e \le x_e^*$.

The forest $F_k$ never routes more than $b_k$ flow, so $x_e \le b_k$. When $x_e^* \le g_k$, $x_e^* = A_{p(k)}(x_e^*)$, so
$x_e \le A_{p(k)}(x_e^*)$. Since $A_{p(k)}$ levels off at $g_k$, this may not hold for $x_e^* > g_k$, but by Lemma 3.10
$b_k \le \frac{1 - 2\gamma^2}{\gamma} g_k$. Therefore $x_e \le b_k \le \frac{1 - 2\gamma^2}{\gamma} A_{p(k)}(x_e^*)$ when $x_e^* > g_k$.

Now let $y_e$ be the flow $F_k$ routes on edge $e$ when the current, stage $k$ demands $\mathcal{D}^{(k)}$ are used. By Lemma
3.1, $E[d_v^{(k)}] = d_v$ for each $v \in \mathcal{D}$. Summing over all the demands that contribute to an edge's flow, we have
$E[y_e] = x_e$.

The cost of $F_k$ with $\delta_k$ cost per unit edge length is

$$E\Big[\delta_k \sum_e \ell_e y_e\Big] = \delta_k \sum_e \ell_e x_e \le \frac{1 - 2\gamma^2}{\gamma}\,\delta_k \sum_e \ell_e A_{p(k)}(x_e^*) \le \frac{1 - 2\gamma^2}{\gamma - \gamma^2}\,\alpha_{p(k)} A_{p(k)}(T_{p(k)}^*)$$

using $\frac{1 - 2\gamma^2}{\gamma} \ge 1$ and $\alpha_{p(k)} \ge (1 - \gamma)\delta_k$ from Lemma 3.7.

We can find an approximate LBFL solution that is a $2\pi_F$-approximation to the optimal cost and reduces
the facility lower bound by a factor of at most 3. Therefore

$$E[P_k^I] \le 2\pi_F E[F_k] \le 2\pi_F \frac{1 - 2\gamma^2}{\gamma - \gamma^2}\,\alpha_{p(k)} A_{p(k)}(T_{p(k)}^*)$$

Sum over all values of $k$ to bound the expected cost by $2\pi_F \frac{1 - 2\gamma^2}{\gamma - \gamma^2} L$. □



Proof of Theorem 3.8. Combining the bounds in Lemmas 3.3, 3.11, and 3.9,

$$E[f(T_{GMM})] \le 4\left(2\pi_F \frac{1 - 2\gamma^2}{\gamma - \gamma^2} + \pi_S \frac{3}{1 - \gamma}\right) L$$

□
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This completes the analysis of A for 7-regular a. If arbitrary a can be 7-regularized for some < 7 < ^ 
it follows that i? = 0(l). 

Recent algorithms for SSBaB are based on the Gupta, Kumar, and Roughgarden (GKR) algorithm
[GKR03, GKPR07], which achieves a better approximation ratio than GMM with a simpler analysis, and
one may wonder whether we could reap the same benefits by basing our proof around this algorithm
instead. One round of GKR is roughly equivalent to one round of GMM (starting with about $g_{k-1}$ demand
at a subset of nodes and ending with about $g_k$ demand at a smaller subset), but the GKR analysis bounds
the entire cost of a round using only one tree, whereas GMM requires two. However, each tree required
by GMM can be easily constructed from some $T_i^*$ in $O(\alpha_i A_i(T_i^*))$, whereas building the tree needed by GKR
within the right bounds seems trickier. Note that Lemmas 3.9 and 3.11 use two different trees,
and T*^^y analyzed in two different ways, either fixed or linear cost per edge. Although this conveniently 
matches the GMM algorithm, it is also required for the proof to work. Using only a single Steiner tree on a
subset of the nodes as in GKR allows less flexibility, so a proof may require a different approach or more 
substantial changes to the original GKR analysis. 

4 Handling Arbitrary $\alpha$

Given any $\alpha$, where $\alpha_i \ge 0$, defining $f(x)$, a concave cost function, and $L$, the multi-level cost, we need
to find regular $\alpha'$ defining $f'(x)$ and $L'$ such that $f(x) = O(f'(x))$ $\forall x$, and $L' = O(L)$. Then applying
Theorem 3.1 to $\alpha'$ gives $f'(T_{GMM}) = O(L')$, and

$$f(T_{GMM}) = O(f'(T_{GMM})) = O(L') = O(L),$$

satisfying the precondition of Theorem 3.8. Note that we can allow $f$ to grow and $L$ to shrink arbitrarily
in the transformation to $f'$ and $L'$, but we need to bound increases in $L$ and decreases in $f$. By scaling by
^ ■ Ui we may assume without loss of generality that Oj = 1. 
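The correspondence between a weight vector $\alpha$ and the "pipes" used throughout this section can be illustrated with a small sketch. It assumes, as the surrounding text suggests but this section does not restate, that $A_i(x) = \min(x, 2^i)$ and that $f(x) = \sum_i \alpha_i A_i(x)$. The function names and the example weights are hypothetical, and the final flat pipe (slope 0) is added only so that the minimum over pipes reproduces the point where $f$ levels off.

```python
from fractions import Fraction

def A(i, x):
    # Assumed form of the aggregation function A_i: linear up to 2**i, then flat.
    return min(x, 2 ** i)

def f_from_alpha(alpha, x):
    # f(x) = sum_i alpha_i * A_i(x), with alpha given as {exponent i: weight}.
    return sum(w * A(i, x) for i, w in alpha.items())

def pipes_from_alpha(alpha):
    # Pipe k is the line sigma_k + delta_k * x, where delta_k is the total
    # weight at or above the k-th breakpoint and sigma_k is the cost already
    # "locked in" by the breakpoints below it.
    pipes = []
    sigma = Fraction(0)
    delta = sum(alpha.values(), Fraction(0))
    for i in sorted(alpha):
        pipes.append((sigma, delta))
        sigma += alpha[i] * 2 ** i
        delta -= alpha[i]
    pipes.append((sigma, delta))  # final flat pipe, delta == 0
    return pipes

def f_from_pipes(pipes, x):
    # A concave piecewise-linear function is the minimum of its linear pieces.
    return min(s + d * x for s, d in pipes)

# Hypothetical example weights at breakpoints 2^0, 2^2, 2^4.
alpha = {0: Fraction(1), 2: Fraction(1, 2), 4: Fraction(1, 4)}
pipes = pipes_from_alpha(alpha)
```

In this example consecutive slopes differ by exactly the weight at the breakpoint between them, which is the relation $\alpha_{p(k)} = \delta_k - \delta_{k+1}$ used in the appendix.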

First, we prove a simple bound on the change between each term $A_i(T_i^*)$ in $L$.

Lemma 4.1. For any $i$ and any $k \ge 0$, $A_i(T_i^*) \le A_{i+k}(T_{i+k}^*) \le 2^k A_i(T_i^*)$.

Proof. Note $A_i(x) \le A_{i+k}(x) \le 2^k A_i(x)$ for $k \ge 0$. Therefore

$$A_i(T_i^*) \le A_i(T_{i+k}^*) \le A_{i+k}(T_{i+k}^*) \le A_{i+k}(T_i^*) \le 2^k A_i(T_i^*).$$

□ 

To regularize the values we run $\alpha$ through a series of three procedures, one for each of the following
lemmas, each of which changes $\alpha$ to satisfy an additional set of constraints. None of the procedures are
conceptually difficult, but the details are quite intricate. We will state the lemmas, give a brief sketch of the
ideas, and present the complete proofs in the appendix.

The first lemma is only a helper used in satisfying the $\sigma$ constraints. The proof serves as a warmup for
the later lemmas, which use similar ideas but are more involved.

Lemma 4.2. Given arbitrary $\alpha$, we can find $\alpha'$ such that the corresponding $f', L', \delta', \sigma'$ satisfy $f(x) \le f'(x)$, $L' \le 2L$, and $\frac{\sigma'_{K-1}}{\delta'_{K-1}} \le D$, where $K$ is the number of pipes, and $D$ is the total demand rounded up to
a power of 2.

The following 2 lemmas perform the actual regularization. 
Lemma 4.3. Given $\alpha$ satisfying $\frac{\sigma_{K-1}}{\delta_{K-1}} \le D$, we can find $\alpha'$ such that the corresponding $f', L', \delta', \sigma'$ satisfy
$f(x) \le 3f'(x)$, $L' = O(L)$, $\frac{\sigma'_{K-1}}{\delta'_{K-1}} \le D$, and $\delta'_{k+1} \le \gamma\delta'_k$ for all $k$.

Lemma 4.4. Given a satisfying f^— ^ < D and 6k+i < 7^^. can find a' such that such the corresponding 
f',L', 5', a' satisfy f{x) < L' = 0{L), 5',^^-^ < 7(5^, and a'f^ < ja^^^for all k. 

The proofs are based around the following idea: check if $\delta_{k+1} > \gamma\delta_k$ or $\sigma_k > \gamma\sigma_{k+1}$, and discard pipes
that violate the constraints. The additional difficulty, relative to the analysis of GMM, arises from the special
form that $f$ must satisfy and the need to bound the increase in $L$. When we remove pipes in general the
indifference points between subsequent pipes will no longer be powers of 2, so $f$ can no longer be defined
in terms of $\alpha$. We fix this by modifying the parameters of an offending pipe until the new breakpoint is a
power of 2. To avoid drastic changes in $L$ or $f$, we achieve this by holding the cost of the given pipe $k$ fixed
at its indifference point with either $k-1$ or $k+1$ and "rotating" the line $\sigma_k + \delta_k x$ around this fixed point
until the other indifference point is fixed.
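The rotation idea can be made concrete with a small numeric sketch. It treats a pipe as a pair (sigma, delta) with cost sigma + delta * x, holds the cost of one pipe fixed at a pivot flow value, and lowers its slope until it meets a second pipe at the next power of 2. The function names and the example numbers are hypothetical and chosen only for illustration; exact rational arithmetic is used so the meeting point can be checked exactly.

```python
from fractions import Fraction

def next_power_of_two(x):
    # Smallest power of 2 that is >= x (for x > 0).
    p = 1
    while p < x:
        p *= 2
    return p

def intersection(pipe_a, pipe_b):
    # Flow value at which the two lines sigma + delta * x have equal cost.
    (sa, da), (sb, db) = pipe_a, pipe_b
    return (sb - sa) / (da - db)

def rotate_to_power_of_two(pipe_k, pipe_next, pivot_x):
    # Keep pipe k's cost at pivot_x fixed and rotate it clockwise (smaller
    # slope, larger fixed cost) until it meets pipe_next at a power of 2.
    sigma_k, delta_k = pipe_k
    sigma_n, delta_n = pipe_next
    pivot_cost = sigma_k + delta_k * pivot_x
    target = next_power_of_two(intersection(pipe_k, pipe_next))
    new_delta = (sigma_n + delta_n * target - pivot_cost) / (target - pivot_x)
    new_sigma = pivot_cost - new_delta * pivot_x
    return (new_sigma, new_delta), target

# Hypothetical example: the two pipes originally meet at x = 16/3.
pipe_k = (Fraction(1), Fraction(1))
pipe_next = (Fraction(5), Fraction(1, 4))
new_pipe, meet = rotate_to_power_of_two(pipe_k, pipe_next, Fraction(1))
```

In this example the rotated pipe has slope 5/7, which is more than a third of the original slope 1, in line with the bound on the slope change proved in the appendix.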

Analyzing the increase in $L$ caused by these procedures is the technical crux in the regularization analysis,
as removing pipes can shift "$\alpha$-mass" in the multi-level cost onto much more expensive trees. We
consider each pipe removal and the terms in $L$ it affects. If $\alpha$-mass is shifted from $A_i(T_i^*)$ to $A_{i+l}(T_{i+l}^*)$,
where $l = O(1)$, then by Lemma 4.1 the current chunk of $L$ has increased by an $O(1)$-factor. If not, we show that the conditions
requiring $l = \omega(1)$ imply there exist large terms in $L$ above $i + l$ that can absorb the increase with only an
$O(1)$-factor loss. We only charge against each $L$-term $O(1)$ times during the entire regularization, so the
total increase is bounded by an $O(1)$-factor.

We summarize the consequences of the regularization procedure below: 

Theorem 4.5. The algorithm A required by Theorem 2.2 exists for a constant $c$, and the oblivious approximation
ratio $R_1$ is constant.



5 Open Problems 

A number of interesting open problems remain to be solved. First, we have only achieved an $O(1)$-ratio
for the objective $R_1 = \max_f E[f(T)]/f(T_f^*)$, but Goel and Estrin [GE03] have shown an $O(\log |\mathcal{D}|)$-approximation
for the much harder objective $R_2 = E[\max_f f(T)/f(T_f^*)]$, proving there exists a single
tree that is simultaneously an $O(\log |\mathcal{D}|)$-approximation for all $f \in \mathcal{F}$. Achieving a constant for this stronger
objective or showing a lower bound remains an important open question.
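The reason the second objective is the stronger one can be written out in one line. The derivation below is a standard observation rather than something stated in the source, and it assumes the reading in which $T_f^*$ denotes an optimal tree for $f$ and the expectation is over the random tree $T$.

```latex
% For every fixed f and every outcome of the random tree T,
%   f(T)/f(T_f^*) <= max_g g(T)/g(T_g^*).
% Taking expectations and then the maximum over f gives R_1 <= R_2.
\[
R_1 \;=\; \max_{f} \frac{E[f(T)]}{f(T_f^*)}
    \;=\; \max_{f} E\!\left[\frac{f(T)}{f(T_f^*)}\right]
    \;\le\; E\!\left[\max_{g} \frac{g(T)}{g(T_g^*)}\right]
    \;=\; R_2 .
\]
```

So any bound on $R_2$ carries over to $R_1$, while a constant bound on $R_1$ says nothing directly about $R_2$.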

Second, although our algorithm proves that an $O(1)$-approximate distribution exists, the ellipsoid algorithm
tells us little about what these trees actually look like. A combinatorial algorithm that yields insight as
to the actual structure of these trees would also be of interest. Third, we have made little attempt to optimize
the constant $c$ in the approximation ratio, and the resulting value is huge due to the regularization procedure.
Shaving large factors off our bound on $R_1$ may be a simple question, and it would be particularly interesting
to find an oblivious approximation algorithm that is competitive with standard SSBaB for known $f$.
A Proofs of regularization lemmas

Lemma 4.2. Given arbitrary $\alpha$, we can find $\alpha'$ such that the corresponding $f', L', \delta', \sigma'$ satisfy $f(x) \le f'(x)$, $L' \le 2L$, and $\frac{\sigma'_{K-1}}{\delta'_{K-1}} \le D$, where $K$ is the number of pipes, and $D$ is the total demand rounded up to
a power of 2.

Proof. Let $k$ be the first pipe such that $\frac{\sigma_k}{\delta_k} > D$. Note $k > 0$ since $\frac{\sigma_0}{\delta_0} = 0$. Remove all pipes above $k$.
Now we modify the parameters of pipe $k$ to satisfy the desired constraint. Increase $\delta_k$, while decreasing $\sigma_k$
so as to hold $\sigma_k + \delta_k 2^{p(k-1)}$ fixed, until $\frac{\sigma_k}{\delta_k} = D$. Geometrically, we are rotating the line $y = \sigma_k + \delta_k x$
counter-clockwise around the point $(2^{p(k-1)}, \sigma_k + \delta_k 2^{p(k-1)})$. Let $\delta'_k, \sigma'_k$ be the new parameters for pipe $k$.
Let $f'$ be the new cost function formed by modifying pipe $k$ and removing pipes $k+1, \ldots, K-1$ and $L'$
the associated multi-level cost.
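The rotation in this proof has a closed form, which the following sketch spells out. Holding the cost $c_0$ at the pivot $x_0$ fixed and requiring $\sigma'/\delta' = D$ gives $\delta' = c_0/(D + x_0)$ and $\sigma' = D\,\delta'$. The function name and the numbers are hypothetical; they are not taken from the paper.

```python
from fractions import Fraction

def rotate_to_ratio(sigma, delta, pivot_x, D):
    # Keep the cost at pivot_x fixed and rotate counter-clockwise
    # (larger slope, smaller fixed cost) until sigma' / delta' == D.
    pivot_cost = sigma + delta * pivot_x
    new_delta = pivot_cost / (D + pivot_x)
    new_sigma = D * new_delta
    return new_sigma, new_delta

# Hypothetical pipe with sigma / delta = 12 > D = 8, pivoting at flow 2.
sigma_k, delta_k = Fraction(12), Fraction(1)
new_sigma, new_delta = rotate_to_ratio(sigma_k, delta_k, Fraction(2), 8)
```

Here the slope rises from 1 to 7/5 and the fixed cost drops from 12 to 56/5, while the cost at the pivot stays at 14.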

Claim: The function $f'(x)$ is concave, and $f(x) \le f'(x)$ for all $x$.

Initially $\delta_k < \delta_{k-1}$ and $\sigma_k > \sigma_{k-1}$, and we continuously decrease $\sigma_k$ while increasing $\delta_k$. We know
$\sigma_{k-1} + \delta_{k-1} 2^{p(k-1)} = \sigma'_k + \delta'_k 2^{p(k-1)}$, so if we decrease $\sigma_k$ to $\sigma_{k-1}$ the modified pipe $k$ will match
pipe $k-1$. However, we have that $\frac{\sigma_{k-1}}{\delta_{k-1}} \le D = \frac{\sigma'_k}{\delta'_k}$, so we stop before reaching that point. Therefore
$\sigma'_k \ge \sigma_{k-1}$ and $\delta'_k \le \delta_{k-1}$, which implies $f'(x)$ is concave since the switchover between pipes $k-1$
and $k$ is unchanged. We only increased the rate of growth for $x > 2^{p(k-1)}$, so $f'(x) \ge f(x)$ for all $x$.

Claim: The new multi-level cost L' is at most 2L. 

There is a term $\alpha_{p(j)}$ for each changeover between pipes as well as the implicit breakpoint at $D$
when $f$ levels off. Increasing $\delta_k$ and removing pipes $k+1, \ldots, K-1$ so that pipe $k$ is used all
the way to $D$ corresponds in $L$ to pushing $\alpha$-mass from the terms $\alpha_{p(k-1)}A_{p(k-1)}(T^*_{p(k-1)}) + \cdots + \alpha_{p(K-1)}A_{p(K-1)}(T^*_{p(K-1)})$ onto $\delta'_k A_{\log D}(T^*_{\log D})$ because $p'(k) = \log D$.

By the definition of cr^ and and Lemma [l!4] we have 







j<k 






The terms $\alpha'_{p(0)}, \ldots, \alpha'_{p(k-2)}$ are unchanged, and $\alpha_{p(k-1)}$ drops due to the decreased difference between
$\delta_{k-1}$ and $\delta'_k$. There are no non-zero $\alpha'_i$ between $p(k-1)$ and $\log D$. This gives us



Next we use Lemma 4.1 to relate $A_{p(j)}(T^*_{p(j)})$ and $A_{\log D}(T^*_{\log D})$:

2^0) 

(^fcAogDlTlogD) < X^"p(j)-^^logD(7l*gZ)) < X]ap0)^p(i)(7'p(j)) < 
j<fc j<A; 

Finally, L' = .)Ap(,)(T;(^.)) + S'^Ay^^T*^^^) < 2L. 

□ 

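Lemma 4.3. Given $\alpha$ satisfying $\frac{\sigma_{K-1}}{\delta_{K-1}} \le D$, we can find $\alpha'$ such that the corresponding $f', L', \delta', \sigma'$ satisfy
$f(x) \le 3f'(x)$, $L' = O(L)$, $\frac{\sigma'_{K-1}}{\delta'_{K-1}} \le D$, and $\delta'_{k+1} \le \gamma\delta'_k$ for all $k$.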

Proof. We repeat the following two steps until $\delta_{k+1} \le \gamma\delta_k$ for all $k$.

1. Deletion Step: The basic idea here is the same as that used by GMM Lemma 3.2 [GMM01] to satisfy
the constraints on the $\delta$'s: whenever a pipe violates the constraint, that is $\delta_{k+1} > \gamma\delta_k$, we remove the pipe.
Let $k$ be the smallest index such that $\delta_{k+1} > \gamma\delta_k$, and let $l$ be the smallest integer such that $\delta_{k+l} \le \frac{\gamma}{3}\delta_k$.
If such an $l$ exists, then remove pipes $k+1, \ldots, k+l-1$, and change $f(x)$ in the interval
$[2^{p(k)}, 2^{p(k+l-1)}]$ using the cheaper of pipe $k$ and $k+l$. If no such $l$ exists then remove all pipes
above $k$, and replace them with pipe $k$. Note that this does not break the condition set in Lemma 4.2.

2. Rotation Step: Pipes $k$ and $k+l$ now have equal cost at some point $g$, but $g$ may not be a power of 2, in
which case $f(x)$ is no longer in the form $\sum_i \alpha_i A_i(x)$, and $\alpha'$ is no longer defined.
We want to modify the pipes to change $g$ while not affecting $L$ or $f$ too much. As in Lemma 4.2 we
hold the cost of pipe $k$ fixed when routing $2^{p(k-1)}$ flow (where we switch from $k-1$ to $k$), and reduce
$\delta_k$ until pipes $k$ and $k+l$ meet at the next power of 2, increasing $\sigma_k$ to maintain $k$'s cost at $2^{p(k-1)}$. This
corresponds to rotating the line $y = \sigma_k + \delta_k x$ clockwise around the point $(2^{p(k-1)}, \sigma_k + \delta_k 2^{p(k-1)})$.
Let $\delta'_k$ and $\sigma'_k$ be the new parameters for pipe $k$. Note that $f'(x)$ now has the proper structure again,
and $\alpha'$ and $L'$ are well-defined. We never increase $\sigma_0$ above 0 since we hold this point fixed when
adjusting pipe 0.
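The deletion step can be sketched as a scan over the slopes. The sketch below assumes the stopping threshold is $\frac{\gamma}{3}\delta_k$, which is what the later claim "By the choice of $l$" appears to require; the source text for that threshold is damaged, so it is passed in as an explicit parameter. All function names and numbers are hypothetical.

```python
from fractions import Fraction

def find_deletion(deltas, gamma, stop_factor):
    # Return (k, l): k is the first index with deltas[k+1] > gamma * deltas[k],
    # and l is the smallest offset with deltas[k+l] <= stop_factor * deltas[k]
    # (l is None if no such offset exists). Return None if nothing is violated.
    for k in range(len(deltas) - 1):
        if deltas[k + 1] > gamma * deltas[k]:
            for l in range(1, len(deltas) - k):
                if deltas[k + l] <= stop_factor * deltas[k]:
                    return k, l
            return k, None
    return None

def delete_pipes(deltas, k, l):
    # Remove pipes k+1, ..., k+l-1, or every pipe above k when l is None.
    if l is None:
        return deltas[: k + 1]
    return deltas[: k + 1] + deltas[k + l :]

# Hypothetical slopes with gamma = 1/2 and stopping threshold gamma / 3.
gamma = Fraction(1, 2)
deltas = [Fraction(1), Fraction(3, 4), Fraction(1, 3), Fraction(1, 8), Fraction(1, 32)]
violation = find_deletion(deltas, gamma, gamma / 3)
remaining = delete_pipes(deltas, *violation)
```

In this example pipes 1 and 2 are removed. A full implementation would follow each deletion with the rotation step and then rescan, which this sketch leaves out.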

First, we bound the change to $\delta_k$ in the rotation step. This allows us to prove that the constraints on the
$\delta$'s are satisfied, and $f(x)$ decreases by at most an $O(1)$-factor.

Claim: After rotation $\delta'_k \ge \frac{\delta_k}{3}$.

Before adjustment, we are indifferent between $k$ and $k+l$ at $(g, y_k)$ where $y_k = \sigma_k + \delta_k g = \sigma_{k+l} + \delta_{k+l} g$.
The difference in costs between $k$ and $k+l$ at $2^{p(k-1)}$ flow remains unchanged because we
hold the cost of pipe $k$ fixed at $2^{p(k-1)}$. Let $x_k = g - 2^{p(k-1)}$, the distance after $2^{p(k-1)}$ at which
their costs are equal. Before rotation, the pipes' costs approach each other at a rate of $\delta_k - \delta_{k+l}$. If we
reduce $\delta_k$ by a factor of 3, then $\frac{\delta_k}{3} - \delta_{k+l} \le \frac{1}{3}(\delta_k - \delta_{k+l})$, so it takes at least $3x_k$ for pipe $k$ to grow
from $\sigma_k + \delta_k 2^{p(k-1)}$ to $y_k$, during which pipe $k+l$'s cost only increases, so pipe $k$ does not surpass
$k+l$ until after $2^{p(k-1)} + 3x_k$.

Figure 1: To ensure the indifference point between pipes $k$ and $k+l$ is a power of 2 we "rotate" pipe $k$
around its starting point until it meets $k+l$ at a power of 2. (Horizontal axis marks: $2^{p(k-1)}$, $2^i$, $2^{i+1}$.)

The original pipe $k$ met pipe $k+1$ (now removed) at some point $2^{p(k)} \ge 2^{p(k-1)+1}$ before meeting
$k+l$ at $g$. Therefore $g > 2^{p(k-1)+1}$, which implies $x_k = g - 2^{p(k-1)} > \frac{g}{2}$. After reducing $\delta_k$ to $\frac{\delta_k}{3}$,
pipes $k$ and $k+l$ now meet after $2^{p(k-1)} + 3x_k = g + 2x_k > 2g$. There must be a power of 2 between
$g$ and $2g$, and we reduce $\delta_k$ only until we hit the next power of 2, so $\delta'_k \ge \frac{\delta_k}{3}$.

Claim: When the procedure is finished $\delta'_{k+1} \le \gamma\delta'_k$ for all $k$.

By the choice of $l$, $\delta_{k+l} \le \frac{\gamma}{3}\delta_k \le \gamma\delta'_k$, using the previous claim. Further $\delta'_k \le \delta_k \le \gamma\delta_{k-1}$, so no
previously-satisfied constraints are broken. We renumber the pipes, and repeat the process for the next
constraint violation. When we are done, all the remaining pipes will satisfy $\delta'_{k+1} \le \gamma\delta'_k$.

Claim: For all $x$, $f(x) \le 3f'(x)$.

Note that removing pipes $k+1, \ldots, k+l-1$ only changes $f$ in the interval $(2^{p(k-1)}, 2^{p(k+l-1)})$, and
we only remove or adjust pipes in this interval once. Initially, removing pipes can only increase $f(x)$,
but then we reduce $\delta_k$ by a factor of at most 3, which may decrease $f(x)$ by a factor of at most 3.

Now, we must bound the potential increase in $L$. To avoid confusion due to relabeling indexes after
removing pipes, we change notation slightly. Suppose the procedure completes after $K'$ iterations. Let
$\alpha'_{p'(0)}, \ldots, \alpha'_{p'(K'-1)}$ be the new non-zero $\alpha$'s, and $\alpha_{p(0)}, \ldots, \alpha_{p(K-1)}$ the original $\alpha$'s. For $0 \le k \le K'-1$
let $\alpha_{p(s_k)}, \ldots, \alpha_{p(s_{k+1}-1)}$ be the $L$-terms affected by the $k$th iteration of the procedure: either they are
removed and merged into $\alpha'_{p'(k)}$, or $\alpha'_{p'(k)} = \alpha_{p(s_k)}$ if the constraint is already satisfied. We need to analyze
how mass is shifted between terms in $L$. Define $L_k = \sum_{i=s_k}^{s_{k+1}-1} \alpha_{p(i)} A_{p(i)}(T^*_{p(i)})$, the portion of $L$ that
round $k$ affects.

Consider round $k$ in which we remove old pipes $s_k + 1, \ldots, s_{k+1} - 1$ and adjust $\delta'_k$. The old $\delta_{s_{k+1}}$ becomes
$\delta'_{k+1}$. Rotating $\delta'_k$ increases $\alpha'_{p'(k-1)}$ because $\alpha'_{p'(k-1)} = \delta'_{k-1} - \delta'_k$ but reduces the total $\alpha$-mass above
$p'(k-1)$ because $\delta'_k = \sum_{i \ge k} \alpha'_{p'(i)}$, decreasing $L$. The remaining $\alpha$-mass on $\alpha_{p(s_k)}A_{p(s_k)}(T^*_{p(s_k)}), \ldots,$
$\alpha_{p(s_{k+1}-1)}A_{p(s_{k+1}-1)}(T^*_{p(s_{k+1}-1)})$ merges into $\alpha'_{p'(k)}A_{p'(k)}(T^*_{p'(k)})$ where $p'(k)$ is somewhere between $p(s_k)$
and $p(s_{k+1})$. If mass from some $\alpha_{p(i)}$ moves down to $\alpha'_{p'(k)}$ where $p'(k) < p(i)$, then we can ignore it, as it
will only reduce $L$. If it moves up, then we will charge the increase to some higher term in $L$.

Let C5 < ^ be some small constant. There ai^e 2 cases to consider: either 6s^^^ > cs6'f^ or (Jg^.^^ < c^^^. 
Case 1: $c_\delta \delta'_k \ge \delta_{s_{k+1}} = \delta'_{k+1}$.

Intuitively, this means there is a big drop between > ^S'^ and 5s^.^l < csS'f,, so Q!p(s^^^_i) 

must be fairly large: ap(s^^i-i) = Ss^._^.l-l — ^s^._^.l > (3 — '^s)^k- We will charge any increase in 
L this iteration to the term «p(sfc_,.i-i)^p(sfc+i-i)(^p(sj^.^^_i))- Note that we are always in this case 
when we remove the last pipe because we can view the last pipe as intersecting a dummy pipe with 
(5 = at D. 

In order to bound ap,^k)^p'{k){T*,^k)) "p(5fc+i-i)^P(sfc+i-i)(^p%fc+i-i)) we must show that p(sfc+i- 
1) > p'{k). Note 2P is the cost at which the new, rotated pipe k surpasses the old pipe Sk+i- New 
pipe k intersects pipe s^+i — 1 before s^+i, and S'j^ > Sg^^-^-i, so pipes k and s^+i meet before 
Sfc+i — 1 and Sk+i do. Therefore g < 2P^'^'^+'-^^\ and when we reduce 6'/^ to fix the breakpoint we 
never need to raise g beyond 2*'(**+i~^) before hitting a power of 2. Therefore 



^ (I - ^0 ^k\{s,+,-i){T;^s,+,-i)) (by assumption) 

j>fe 

We can charge the increase in Q!p/(fc) to Qp(^^^^_i) in the current chunk L^, with a loss of (| — c^) ^ = 
^_3^^ , and this charge can only occur once for each L^. 

Case 2: $c_\delta \delta'_k < \delta_{s_{k+1}}$.

In this case there is no large collection of mass that we can easily guarantee is above p'{k) in the 
current interval, but we do know there must be a lot of mass somewhere above p(sfc+i — 1) because 
is large. The a-mass ap(^sk+i) + ■ • • + «p(sfc+2-i) = ^sk+i - is "used" in the next iteration 
and contributes Lk+i to L. We know 'y5s|^_^_J^ = jS'^^^ > 5s^_^_^, which impUes Yltt^sl+l = 
^sk+i ~ > (1 ~ 7)^sfc+i- Now we can bound the increase 

- ^■'^k+i^P'ik){Tp(^k)) (by assumption) 

^{^^-^)[t^ i «P» J ^P'(fe)(^p'(fe)) (shown above) 



«=Sfc+l 



1 - C5 



C5(l - 7) 



Therefore we can charge the increase in L due to iteration k to the portion L^+i used in the next 
iteration. 

For a particular segment of L, the — 1th iteration may been bounded by cgii'l^-y) increase in L^, 
and the fcth iteration may charge against a t-Stt increase. Each type of charge can occur at most once per 
chunk. Therefore the total increase in each piece, and hence the total increase in $L = \sum_k L_k$, is

l-cs ^ 3 



This completes the proof. □ 

Lemma [4.4l Given a satisfying ^^-j < D and 5^+1 < ^6k, we can find a' such that such the corresponding 
f',L', 6', a' satisfy f{x) < L' = 0{L), 6',^^-^ < 7(5^, and a'f^ < ja^^^for all k. 

Proof. The proof follows Lemma 4.3 but moves backwards through the pipes rather than forwards.

1. Deletion Step: Let k be the highest index such that (Jk-i > 70"^, and / > 1 the smallest integer such that 

ak-i < ^Cfc- Such an / must exist because o"o = 0. Remove pipes k — I + 1, . . . , k — 1, and replace 
them with the cheaper of pipes k — I and k. 

2. Rotation Step: As in Lemma 4.3, $f(x)$ may no longer be a linear combination of terms $A_i(x)$ because the
new indifference point may not be a power of 2. We use a similar procedure as before to remedy this.
Hold pipe $k$'s cost for $2^{p(k)}$ flow fixed, and reduce $\sigma_k$ while increasing $\delta_k$ to maintain the invariant
until $k$ and $k-l$ meet at a power of 2. Geometrically we are rotating $y = \sigma_k + \delta_k x$ counter-clockwise
around $(2^{p(k)}, \sigma_k + \delta_k 2^{p(k)})$. Let $\sigma'_k, \delta'_k$ be the new parameters. Note that $\alpha'$ and $L'$ are now
well-defined.

First, we analyze the change to $\sigma_k$ and $\delta_k$ required by the rotation step and use this result to prove the
constraints on both the $\sigma$'s and $\delta$'s are satisfied at the end without changing $f(x)$ too much.

Claim: After rotation cr^ > and 5'f, < ^5k- 

Suppose the unmodified pipe k and k — I meet at (7 = ^^-pj^- We will bound the adjustment required 
to guarantee they meet before |. Reduce ak to |crfc = cr^. The modified pipe k has the same cost as 
the old at 2!p^^\ If k is the final pipe then from Lemma l42l we know D = 2*''^'^) > Otherwise, pipe 
k costs the same as A; + 1 at 2P^'^\ so we have that 2p(*') = "s^^^Sk'^l > f^, using 70-^+1 > ak (the 
constraint fixed in the previous iteration). In either case 5k2^^^^ > ak- Now, 

afc + 42^'(^) = ^afc + 5^2^'W 
5 

The constraints on the ds were satisfied before removing pipe A; — 1, so 6k-i > -^Sk- This implies 



5k ^ 4 _^ 1 1 



5k-i - 5k -ij5k -5k 4-1 3 
using 7 < ^. We combine this with the bound on 5'f^ to bound the change in 5k-i — 5^: 

5k-i - f^fc > i^k-i - ^k) - |<5fe ={5k-i - 5k) (1 - ^- — ^ ] 

5 V 5 5k-i-dkJ 



>{5k-i ~ ^k) ( I - ^ ■ -^j = -^(^k-i - h) 
Now we have enough information to bound the new switchover point.



cr^ — ak_i _ — (Tk-i ^ |(cfc — (Jk-l) ^ fC^fc — Cfk-l) _ I ak — (Jk-l _ 1 
h-i-5'k h-i-S'f^ ~ 5k-i-S'j^ ~ ^{6k-i - dk) '^^k-i-h 2 

There must be a power of 2 between | and g, so we need to reduce ak by at most a factor of |. Finally, 
note that pipes 0 and 1 meet no sooner than 1, and $k > 1$ since it is always true that $\gamma\sigma_1 > \sigma_0 = 0$.
Therefore g > 1, and hence the new changeover point is at least 1, so we do not need to worry about 
a term 

Claim: When the procedure finishes < jS'j^ and cr^ < 70"^_,_i for all k. 

We chose I such that ak-i < x'^fc' ^k-i < 70"^- Before starting, we had ^"^Sk-i > J^k-i > ^k^ 
and 7 < |, which implies 5^ < ^6k < |7^(^fc-/ < ^^k-i- Note that the rotation step does not break 
any previously-satisfied constraints on larger /c's. 

Claim: For all x, f{x) < 

Only 1 round affects the interval (2^'^'^"'"^^ 2^'^'^)). Removing pipes only increases f{x), and if we 
adjust (Tfc, then it decreases by a factor of at most |, while 6k increases, so f'{x) > ^f{x). 

Now we analyze the increase in $L$. First, unlike in Lemma 4.3 the rotation step works against us, and
we need to bound the increase.

Claim: Rotation only increases L by an 0(l)-factor. 

When adjusting pipe k, we increase 6k without changing 6k+i, which increases Opf^k)- We have that 
ap(fc) > (1 - 7)Sk, and (5^ < so 

= s'k-h^i < (Sk-Sk^i) (1 + ^^^4^) < (1 + Ij;^^) = ^r^«pw 

causing L to increase by at most 



Second, we need to bound the increase in L caused by removing pipes. Let K' be the number of iter- 
ations and final pipes and ctp/^g), • • • >CKp/(x'-i) resulting non-zero a's. Iteration A;, for 1 < A; < K', 

deletes pipes Sfc+i+1, . . . , Sfc-1 which removes ap(,,+i), • • • , ap(.fe-i)- LetLk = Ei=7,Vi "p(»)^p(»)(^p(i)) 
be the amount these contribute to L. Since it moves backwards through pipes the indices of' new pipes are not 
fixed yet, but as labeled at the end, round k ensures a'j < 70'^4_| and creates a term ctp/^j) where j = K' — k. 

The rotation step reduces both Op/^^) and p'{j) which can only help in this step, and we have already 
bounded the increase in a^,^^^^^ due to rotation, so we assume that no rotation is needed. This implies 

'^'p'U) ~ '^■^'=+1 ~ ^^i: ~ '^i=Sk+i '^p{i)- Lemma |43] we need to ensure that too much a-mass does not 
move too high. 

Let Cct < ^ be a small constant. We need to consider two cases again: either < CaC^j_^_^ or 

Case 1: ag^^^ < CaCrj_^_^. 

Intuitively, this means as^^^j+i is much larger than (Jg^^-^ because ag^^-^+i > ^ct^+i, so by the time 
pipe Sfc+i catches up with pipe Sk+i + 1 or any later pipe, it has already covered an 0(l)-fraction 
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of the distance to 2'p'^\ Therefore, pushing mass from up to Api^j^{T*,^j-^) increases L by only a 
constant factor. 

We bound 2^'^^^ by bounding the cost to which pipe Sk+i must grow before switching pipes. Before 
removal the old pipe Sk — I crossed the new 7 + 1 at 2^'^*'="^) = "V;~^ < 1 — < tt^, 

SO o-'j^i + 5^^^2P(**^^' < 2a'j^^. Pipe Sfc+i's cost increases faster than — I's and surpasses s^'s 
cost before 2^'(^'=-^). Therefore a'^^^ + < 2a'j^^. 

We know as,._^_^+i > or else it would not have been removed. When s^+i intersects s^+i + 1 

at 2^'(*'=+i) it has grown from oSk+i ^ l^^st (T5j.^j+i and therefore has covered at least 

fraction of the distance to the indifference point between s^+i + 1 and s^. Therefore 
Every other affected ctp^j) is pushed up less than ctp^s^.^^), so 

i=Sk+i 
Sfc-1 

< 



Case 2: (Js^_^j > CctCTj+i. 

In this case pipes s^+i and s^+i + 1 may meet very early, and ApiQ^{T*,^^) could be much bigger 
than Ap(sj^._^^) Note that we are never in this case when (Tsj^._,_i = 0. We have that 

a^+, + 5,+i2^'(^)=a.,,,+5,,,,2^'(^) 



a 



After the next round — which we know occurs because (Jg^^-^ 7^ — 0-3^+2 ^^^^ PiP^ preceding 

as^^-^ (which is a'j). Using a.^^^.^ < l'^s^+-,, it is easy to see that as^^^ < '''''^\_'!^''"^^ and from the 
formula for a,,^^ we have a,,^, - a,,^^ = «p(i)2*'^') 
Combining the previous inequalities. 



-\ Ca J 2P'U) ~\ Ca J \ 1-7 y 2P'(i) 



Ca(l-7) . 

l=Sk+2 
Now we can apply Lemma 4.1 to finish the bound:

''Ca{l-l) 



«=«fe+2 
Sfc+1-1 



1 ^T-J^ 

*— *fe+2 



Therefore we can charge the increase in this iteration to $L_{k+1}$ used in the next iteration.

10 



For a particular chunk of L, round /c's increase may be bounded by a ^^iq^ — factor increase and 



round k — 1 may be bounded by a ^^^^^^ -factor increase. Each charge only occurs once. The rotation step 

(1-7) 



adds another factor of on top of this. Therefore, the total growth of L is at most 



57 _^ 10 



5(1-7) Vca(l-7) 27-5c,^ 

□ 